

# AN ENERGY EFFICIENT CONDITIONAL BOOSTING FLIPFLOP FOR HIGH PERFORMANCE SENSE AMPLIFICATION APPLICATIONS

<sup>1</sup> Y. SIREESHA PG STUDENT IN DEPT OF ECE IN SREE VAHINI INSTITUTE OF SCIENCE & TECHNOLOGY IN TIRUVURU, ANDHRA PRADESH, 521235.

<sup>2</sup> Dr. R. SRIDEVI PROFESSOR AND HOD IN DEPT OF ECE IN SREE VAHINI INSTITUTE OF SCIENCE & TECHNOLOGY IN TIRUVURU, ANDHRA PRADESH, 521235.

Abstract: One of the major challenges in modern VLSI design is power consumption, alongside area and performance. Digital systems rely on the flip-flop. We examine and contrast four different flip-flop topologies in sub-threshold operation: IP-DCO, MHLFF, CPSFF, and CPFF. The topologies include both pulse-triggered and conditional approaches. Very low power consumption applications are now within reach thanks to sub-threshold technology. One advantage of this technique is that it reduces the power drawn by otherwise power-hungry flip-flops. Compared to a strong-inversion circuit, a sub-threshold circuit consumes less power while running at the same frequency. The designs are implemented in Tanner using 18 nm CMOS technology. We measure the delay, power-delay product, and average power of the flip-flops at a 1 V supply voltage and compare them from every perspective.

Keywords: Sub Threshold Technology, Flip Flop, Low Power.

# **1. INTRODUCTION**

Meeting the demand for electronic systems that operate at high speed requires higher clock frequencies and tighter timing requirements [1]. To satisfy these timing requirements, such systems have relied on high-speed circuits, which consume a lot of power. At the same time, the need for energy-efficient computation has grown because of the widespread use of mobile electronic devices in daily life [2, 3]. Energy constraints may be addressed using low-power design solutions, which may involve sacrificing speed. These solutions include voltage scaling to decrease switching power and conditional operations to prevent redundant power consumption [4, 5]. For high-performance mobile applications, it is standard practice to make a conscious effort to reduce power consumption while still achieving fast processing speeds.

Synchronous digital integrated systems use flip-flops and latches to control state transitions and to synchronise data flow. Since flip-flops are often placed in timing-critical signal paths, which dictate the maximum operating frequency, their high-speed design is vital [8]. With millions of flip-flops used by a single processor and their combined power consumption accounting for 20-40% of the total power, low-power flip-flop design has become more important in recent years [9], [10], [11]. Because it is difficult to reduce the power consumption and the delay of flip-flops and latches at the same time, this trade-off becomes an essential consideration in the design of high-speed mobile electronic systems.

With its master-slave architecture, the transmission-gate flip-flop (TGFF) [12] (Figure 1) may provide synchronous digital integrated circuits with minimal power consumption and data-to-output (DQ) latency. Another potential benefit of TGFF is its reliable operation in the near-threshold voltage (NTV) range, which allows power savings via voltage scaling. Because TGFF divides the sampling and capturing of input data into the master and slave phases, respectively, so-called pulse-based techniques may be used to decrease the DQ latency [13], [14], [15]. The transmission-gate pulsed latch (TGPL) [13], which combines a pulse generator with a single latching stage, is an example of such an implementation. Because the master stage of TGFF is removed and the input data is passed to the output within a narrow pulse generated at the clock edge, the DQ latency is reduced. Due to the circuit overhead needed to create the brief pulse, TGPL may consume a significant amount of power despite its fast operation. Unreliable operation, especially in the NTV region, may also result from pulse-width variability caused by process variations. Although novel circuit techniques have significantly improved pulse generation, total power consumption is still greater than with TGFF because pulsed operation needs internally delayed local clocks [14], [15].

The sense-amplifier-based flip-flop (SAFF) is another approach to increasing the speed of timing elements [17]. It achieves high-speed operation by using a differential precharged circuit in the first stage for fast sampling of the input data at the triggering clock edge and a symmetric latch for capturing it. Altering the architecture of the latching stage may improve power and speed further, but it can also lead to undesirable signal fighting, which degrades latency and power consumption [18]. Additionally, these flip-flops are vulnerable to the increased variability in the NTV region since they use a weak shorting device to guarantee static operation. Although this issue may be resolved by monitoring the precharged nodes, doing so incurs significant power and latency overheads. This work introduces conditional bridging flip-flops (CBFFs), which are based on sense amplifiers and may improve speed while lowering power consumption.





**JNAO** Vol. 15, Issue. 2 : 2024

The suggested conditional bridging method eliminates the problems with the shorting device mentioned above without incurring any power or speed overheads. A single-ended version of the flip-flop (CBFF-S) is suggested for reducing power consumption, while a differential version (CBFF-D) is suggested for reducing latency. In addition to being fast, low-power, and contention-free, the CBFFs operate reliably in the NTV region.

# 2. EXISTING METHOD

#### 2.1. SENSE-AMPLIFIER-BASED FLIP-FLOP

### A. CONDITIONAL BRIDGING

A conditional bridging approach is suggested as a more power-efficient way to fix the problems with the shorting device (M4) in traditional SAFFs. The approach is motivated by the observation that the shorting device needs to be activated only when D changes after being captured by Q, so the goal is to eliminate the redundant transitions. In all other situations, it is better to disable the device in order to avoid the drawbacks of a weak device and the unnecessary discharge of the internal node (X or Y) on the opposite branch. The figure shows the SA stage with the conditional bridging circuit, whose output (CBG) drives the shorting device. The conditional bridging circuit monitors the values of D, DB, SB, and RB and switches on M4 during CK=1 only when D changes and becomes different from Q. When CK is low, SB and RB are precharged high, which turns on M13, M17, and one of M12 and M16, so CBG is kept low regardless of the value of D. At the rising edge of the clock, either SB or RB discharges, depending on the value of D. Assuming SB is discharged, CBG is kept low through M16-M17 as long as D=RB=1. As soon as D goes low, CBG rises via M14 and M15, which activates M4 to provide a DC path to ground and guarantees static operation.
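The switching condition described above can be summarised as a small behavioural model. The following Python sketch is only a logic-level abstraction, not the transistor-level circuit: the function name `cbg` and the 0/1 encoding are illustrative assumptions, and the mirrored case (RB discharged, then D going high) is inferred from the symmetry of the SA stage rather than stated explicitly in the text.

```python
def cbg(d: int, sb: int, rb: int) -> int:
    """Behavioural model of the conditional-bridging output CBG.

    d, sb and rb are logic levels (0 or 1). Returns 1 when the shorting
    device M4 should be switched on, and 0 otherwise.
    """
    for name, value in (("d", d), ("sb", sb), ("rb", rb)):
        if value not in (0, 1):
            raise ValueError(f"{name} must be 0 or 1")
    if sb == 0 and rb == 0:
        # Assumption: only one SA output is discharged in a given cycle.
        raise ValueError("sb and rb cannot both be discharged")
    # SB discharged (example in the text): CBG stays low while D=1 and
    # rises as soon as D goes low.
    # RB discharged: mirrored condition, assumed by symmetry.
    # SB=RB=1 (precharge, CK low): CBG is low regardless of D.
    return int((sb == 0 and d == 0) or (rb == 0 and d == 1))
```

For example, `cbg(d=1, sb=0, rb=1)` models the case where D is still equal to the captured value, so M4 stays off, whereas `cbg(d=0, sb=0, rb=1)` models D changing during CK=1, which enables M4.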






## **B. STRUCTURE AND OPERATION**

Two variants of the conditional-bridging flip-flop (CBFF) are suggested, both of which use the conditional bridging mechanism discussed above. As shown in Figure 5, the single-ended version (CBFF-S) consists of a sense-amplifier (SA) stage (M0-M9 and I0), a conditional bridging circuit (M11-M16), and a single-ended latch stage (M18-M23, I1, and I2). The conditional bridging circuit is modified to reduce the overall number of transistors in the flip-flop. More specifically, D drives the source of M11 directly, whereas DB drives the source of M15. In addition, M17, which was controlled by RB in Figure 4, is merged with M21 of the latch in Figure 5. The glitch-free and contention-free single-ended latch is driven by the SA stage without inversion (as indicated in the right portion of the figure), which allows the latching stage to be optimised for power consumption and device count. M18 uses only RB to pull up QN following the rising clock edge, whereas M19-M21 use only RB to pull down QN. M20, which is driven by D, is inserted to remove the QN glitches caused by the precharged high value of RB at the beginning of the clock-high phase. To pull down QN without contention, SB drives the source of M22. Likewise, node A is connected to the source of M23 so that QN can be pulled up without contention. The absence of pulsed operation distinguishes the latching stage of CBFF-S from that of conventional pulsed latches [13], [14], [15]. CBFF-S is the variant to choose when operational reliability, latency, and power consumption matter.

The conditional bridging logic avoids unnecessary CBG transitions by controlling when the shorting device (M4) is engaged. At low switching activity, the conditional bridging circuit minimises power consumption because it switches only when D changes after Q has captured D during CK=1. Because the shorting device is enabled only when needed, it can be made as small as possible. To save power, CBFF-S discharges the opposite precharged node (X or Y) only if D changes, and D seldom changes under low input switching activity. Conventional SAFFs, as previously mentioned, charge and discharge these nodes every clock cycle. A minimum-sized shorting device also has a smaller parasitic capacitance, which may reduce latency by allowing timing-critical signals such as SB and RB to be pulled down faster. Disabling the shorting device allows faster input sampling by eliminating contention between SB and RB. Because RB controls the latching stage directly, signal inversion and contention are eliminated, which decreases the clock-to-output (CQ) delay. Reducing contention in the SA stage makes it possible to pull down the precharged nodes reliably at low supply voltages. Together, the contention-free latching stage and the conditional-bridging SA stage of CBFF-S ensure that input data is reliably captured and that the flip-flop operates stably in the highly variable NTV region.

The differential version of the suggested flip-flop, referred to as CBFF-D, is presented next. Figure 13 shows two transistors, M13 and M30, that may be combined in the conditional bridging circuit without the need to add a third transistor, thanks to the symmetric differential structure. By adding a few transistors to the latching stage and removing the output inverter (I2 in the figure), SB and RB can drive the differential outputs Q and QB. Placing the CK-driven M24 in parallel with the pull-up keeper transistors M22 and M25 speeds up the output pull-down by preventing contention. While a configuration comparable to the single-ended version (M22 driven by SB in the figure) performs well at nominal supply voltages, the additional transistor (M24) is needed because of reliability concerns revealed by Monte-Carlo simulations at the worst corners. In terms of operating speed and power consumption, CBFF-D is similar to its single-ended counterpart. Although its overall power consumption is slightly higher owing to the larger CK load capacitance needed to operate the differential latch, CBFF-D still greatly reduces power consumption at low switching activity through its conditional bridging function. Because the outputs of the SA stage drive Q and QB through the differential latch, CBFF-D can outperform CBFF-S in terms of speed.





# 3. PROPOSED FLIP FLOP DESIGNS

# 3.1 CONDITIONAL BOOSTING AMPLIFIER

A conditional boosting flip-flop is a flip-flop circuit that uses conditional boosting to improve its performance in capturing input data. The logic states of the input and output signals determine which boosting operation the flip-flop applies. To optimise its operation for different input data conditions, the conditional boosting flip-flop combines output-dependent presetting and input-dependent boosting. This approach minimises power consumption during regular operation while allowing faster data capture when needed.

For output-dependent presetting, the preset voltages of the capacitor terminals N and NB are determined by the outputs Q and QB, as shown in Figure 4(a). The left-hand diagram in Figure 4(a) shows that N is preset low and NB is preset high when Q is low and QB is high. Conversely, N is preset high and NB low when Q is high and QB is low (right-hand diagram in Figure 4(a)). For input-dependent boosting, as seen in Figure 4(b), one nMOS transistor connects the noninverting input (D) to NB, and another nMOS transistor connects the inverting input (DB) to N.

Suppose the flip-flop stores low data, so that the capacitor is preset as on the left side of Figure 4(a). In this case, as seen in the top left diagram of Figure 4(b), a high input pulls NB to ground, which in turn boosts N towards -VDD via capacitive coupling. A low input, on the other hand, connects N to ground, as shown in the bottom left diagram of Figure 4(b). However, because N is already preset to VSS, virtually no voltage change occurs at NB, and no boosting takes place. In the alternative case, the flip-flop stores high data and the capacitor is preset as in the right-hand diagram of Figure 4(a). A low input then pulls N to ground, which in turn boosts NB towards -VDD via capacitive coupling, as shown in the bottom right-hand diagram of Figure 4(b).





|                             | input (D) | output (Q) | boosting node (N) | boosting node (NB) |
|-----------------------------|-----------|------------|-------------------|--------------------|
| output-dependent presetting | -         | VSS        | VSS               | VDD                |
|                             | -         | VDD        | VDD               | VSS                |
| input-dependent boosting    | D=VDD     | VSS        | VSS → -VDD        | VDD → VSS          |
|                             | D=VDD     | VDD        | VDD               | VSS                |
|                             | D=VSS     | VSS        | VSS               | VDD                |
|                             | D=VSS     | VDD        | VDD → VSS         | VSS → -VDD         |

# Fig 5: Findings for various Q and QB outputs
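The presetting and boosting rules described in the text can be checked with a small idealised model. The Python sketch below assumes perfect capacitive coupling across the boosting capacitor (no parasitic capacitance or charge loss), normalises voltages so that VSS = 0 and VDD = 1, and uses the illustrative function names `preset` and `boost`, which do not appear in the original design.

```python
from typing import Tuple

VSS, VDD = 0.0, 1.0  # normalised supply levels (assumption)


def preset(q: float) -> Tuple[float, float]:
    """Output-dependent presetting: return (N, NB) for a stored output Q."""
    return (VDD, VSS) if q == VDD else (VSS, VDD)


def boost(d: float, q: float) -> Tuple[float, float]:
    """Input-dependent boosting: return (N, NB) after input D is applied."""
    n, nb = preset(q)
    if d == VDD:
        # A high input pulls NB to ground; the step couples onto N.
        step = VSS - nb
        nb = VSS
        n += step
    else:
        # A low input pulls N to ground; the step couples onto NB.
        step = VSS - n
        n = VSS
        nb += step
    return n, nb
```

In this model a node is boosted to -VDD only when the input differs from the stored output, for example `boost(VDD, VSS)` or `boost(VSS, VDD)`. When D equals Q, the pulled node is already at VSS, the step is zero, and both nodes keep their preset values.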

The proposed flip-flop consists of a conditional-boosting differential stage, a symmetric latch, and an explicit short pulse generator. Figure 6(a) shows the conditional-boosting differential stage. MP5, MP6, MP7, MN8, and MN9 are used for output-dependent presetting, while MN5, MN6, and MN7, together with the boosting capacitor CBOOT, are used for input-dependent boosting. Figure 6(b) shows the symmetric latch, which consists of MP8-MP13 and MN10-MN15. An explicit pulse generator, illustrated in Figure 6(c), creates a short pulsed signal PS that activates specific transistors inside the differential stage. The suggested pulse generator differs from traditional ones in that it does not use a pMOS keeper. Because signal fighting is eliminated during the pull-down of PSB, this leads to faster operation and lower power consumption. MP1, introduced in tandem with MN1, takes over the role of the keeper: it helps to pull down PSB quickly and also keeps its logic value high. At the rising edge of CLK, MN1, MP1, and I1 quickly discharge PSB, which causes PS to go high. After the delay of I2 and I3, MP2 charges PSB, which causes PS to go back to low. The result is a short positive pulse at PS whose width is set by the delay of I2 and I3. During the low phase of CLK, MP2 is inactive and MP1 holds PSB high. Our analysis shows that, for the same pulse widths and slew rates, energy reductions of up to 9% are possible.
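The timing of the pulse generator described above can be illustrated with a first-order delay model. The Python sketch below is a simplification under stated assumptions: the class `PulseGenDelays`, the function `ps_pulse`, and all delay values are hypothetical, and the model ignores slew rates and the time MP2 needs to recharge PSB.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PulseGenDelays:
    """Hypothetical delays, all in the same time unit (for example ps)."""
    discharge: float  # CLK rising edge to PS going high (MN1, MP1, I1)
    i2: float         # delay of inverter I2
    i3: float         # delay of inverter I3


def ps_pulse(t_clk_rise: float, d: PulseGenDelays) -> Tuple[float, float]:
    """Return the (rise time, fall time) of PS for one rising CLK edge."""
    t_rise = t_clk_rise + d.discharge
    # PS falls once the I2 + I3 delay has elapsed and MP2 recharges PSB.
    t_fall = t_rise + d.i2 + d.i3
    return t_rise, t_fall


def pulse_width(d: PulseGenDelays) -> float:
    """Width of the PS pulse, set by the delay of I2 and I3."""
    return d.i2 + d.i3
```

With placeholder values such as `PulseGenDelays(discharge=20.0, i2=30.0, i3=40.0)`, the pulse width depends only on the two inverter delays, which reflects the statement that the width of PS is dictated by the delay of I2 and I3.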



Fig:6. Proposed CBFF. (a) Conditional-boosting differential stage. (b) Symmetric latch. (c) Explicit short pulse generator.

Delay, which depends on parasitic capacitances, and other power consumption issues may be circumvented by removing those capacitances. The figure displays the variation in the delay of the flip-flops. Based on the graph, MHLFF has a lower delay than the other flip-flops. The figure also displays the average power that each flip-flop consumes. Among the four flip-flop designs, CPSFF has the lowest average power and MHLFF has the highest. In the comparison of the power-delay product, CPSFF has a lower value than the other flip-flops. Among these four flip-flops, CPSFF provides the best performance.



Fig: 7. The suggested flip-flop's timing diagram.

# 4. RESULTS



Fig:8. Existing schematic Single-ended version flip-flop



Fig:9. Existing schematic Differential version flip-flop.



**Fig:10. Schematic of Proposed Conditional Boosting Flipflop.** 



Fig 11: Amplified output waveform.

The power consumption of the system fluctuated widely, averaging 8.1984 W. Over this period, it varied from 0.00000 W at the lowest to 1.7371 W at the highest. A setup time of 0.03 seconds was reported for the system to stabilise, which is in close agreement with the DC operating point. Further investigation showed a transient phase lasting 0.39 seconds before the system settled into a steady state.

| Parameters      | Existing method | Proposed method |
|-----------------|-----------------|-----------------|
| Min input power | 4.736 W         | 0.0000 W        |
| Max input power | 7.6304 W        | 1.7371 W        |
| Avg input power | 6.2587 W        | 8.1984 W        |
| Delay           | 0.95 sec        | 0.82 sec        |
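The abstract names the power-delay product (PDP) as one of the evaluation metrics. As a hedged illustration, the Python sketch below computes the PDP as average power multiplied by delay, taking the values and units in the table above at face value. The function name `power_delay_product` and the dictionary layout are illustrative assumptions.

```python
def power_delay_product(avg_power_w: float, delay_s: float) -> float:
    """Return the power-delay product in joules (watts times seconds)."""
    return avg_power_w * delay_s


# (average input power in W, delay in s), copied from the table above.
results = {
    "existing": (6.2587, 0.95),
    "proposed": (8.1984, 0.82),
}

pdp = {name: power_delay_product(p, t) for name, (p, t) in results.items()}

for name, value in pdp.items():
    print(f"{name}: PDP = {value:.4f} J")
```

Because the PDP combines both metrics in one number, it helps compare designs where one has the lower delay and the other the lower average power.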

## **4.1 PERFORMANCE COMPARISON:**

## 5. CONCLUSION

This paper introduces sense-amplifier-based flip-flops that are reliable, high-performance, and power-efficient. To avoid unnecessary transitions while maintaining static operation, the proposed conditional bridging adaptively activates the shorting device. The shorting device can therefore be made as small as possible, which lowers the effective parasitic capacitance along the timing-critical signal paths. Power consumption and delay are both substantially reduced because the latching stage is driven directly, without glitches or contention. To optimise power consumption and area, the single-ended version of the proposed flip-flop uses a modified latching stage. The differential version, which incorporates a differential latching stage, is introduced for the best performance and differential operation. The proposed flip-flops operate stably all the way down to the NTV region and also offer superior power and latency performance. A performance analysis in 18-nm CMOS technology showed that the proposed flip-flops performed well, indicating that they may be valuable in low-power, high-speed digital applications. To allow aggressive voltage scaling to the region around the threshold voltage without considerably lowering performance, a new conditional boosting flip-flop (CBFF) has been presented. This project introduces a pulse-triggered flip-flop design that is suited to low-power applications and uses a boost body driven approach. Finally, we present the concept of a multibit flip-flop combined with a conditional boosting flip-flop, which effectively improves power, area, and latency.

## **6. REFERENCES**

[1] F. S. Ayatollahi, M. B. Ghaznavi-Ghoushchi, N. Mohammadzadeh, and S. F. Ghamkhari, "AMPS: An automated mesochronous pipeline scheduler and design space explorer for high performance digital circuits," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 69, no. 4, pp. 1681–1692, Apr. 2022, doi: 10.1109/TCSI.2021.3138139.

[2] Y. D. Kim, W. Jeong, L. Jung, D. Shin, J. G. Song, J. Song, H. Kwon, J. Lee, J. Jung, M. Kang, J. Jeong, Y. Kwon, and N. H. Seong, "A 7 nm high-performance and energy-efficient mobile application processor with tri-cluster CPUs and a sparsity-aware NPU," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, USA, Feb. 2020, pp. 48–50, doi: 10.1109/ISSCC19947.2020.9062907.

[3] J. P. Cerqueira, T. J. Repetti, Y. Pu, S. Priyadarshi, M. A. Kim, and M. Seok, "Catena: A near-threshold, sub-0.4-mW, 16-core programmable spatial array accelerator for the ultralow-power mobile and embedded Internet of Things," IEEE J. Solid-State Circuits, vol. 55, no. 8, pp. 2270–2284, Aug. 2020, doi: 10.1109/JSSC.2020.2978137.

[4] S. Jain et al., "A 280 mV-to-1.2 V wide-operating-range IA-32 processor in 32 nm CMOS," in IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers, San Francisco, CA, USA, Feb. 2012, pp. 66–68, doi: 10.1109/ISSCC.2012.6176932.

[5] V. De, S. Vangal, and R. Krishnamurthy, "Near threshold voltage (NTV) computing: Computing in the dark silicon era," IEEE Des. Test., vol. 34, no. 2, pp. 24–30, Apr. 2017, doi: 10.1109/MDAT.2016.2573593.

[6] C.-R. Huang and L.-Y. Chiou, "An energy-efficient conditional biasing write assist with built-in time-based write-margin-tracking for low-voltage SRAM," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 29, no. 8, pp. 1586–1590, Aug. 2021, doi: 10.1109/TVLSI.2021.3084041.


[7] Y.-W. Kim, J.-S. Kim, J.-W. Kim, and B.-S. Kong, "CMOS differential logic family with conditional operation for low-power application," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 5, pp. 437–441, May 2008, doi: 10.1109/TCSII.2007.914414.

[8] C. Giacomotto, N. Nedovic, and V. G. Oklobdzija, "The effect of the system specification on the optimal selection of clocked storage elements," IEEE J. Solid-State Circuits, vol. 42, no. 6, pp. 1392–1404, Jun. 2007, doi: 10.1109/JSSC.2007.896516.

[9] J. L. Shin, R. Golla, H. Li, S. Dash, Y. Choi, A. Smith, H. Sathianathan, M. Joshi, H. Park, M. Elgebaly, S. Turullols, S. Kim, R. Masleid, G. K. Konstadinidis, M. J. Doherty, G. Grohoski, and C. McAllister, "The next generation 64b SPARC core in a T4 SoC processor," IEEE J. Solid-State Circuits, vol. 48, no. 1, pp. 82–90, Jan. 2013, doi: 10.1109/JSSC.2012.2223036.

[10] H. McIntyre, S. Arekapudi, E. Busta, T. Fischer, M. Golden, A. Horiuchi, T. Meneghini, S. Naffziger, and J. Vinh, "Design of the two-core x86-64 AMD 'Bulldozer' module in 32 nm SOI CMOS," IEEE J. Solid-State Circuits, vol. 47, no. 1, pp. 164–176, Jan. 2012, doi: 10.1109/JSSC.2011.2167823.

[11] D. Pan, C. Ma, L. Cheng, and H. Min, "A highly efficient conditional feedthrough pulsed flip-flop for high-speed applications," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 28, no. 1, pp. 243–251, Jan. 2020, doi: 10.1109/TVLSI.2019.2934899.

[12] J. M. Rabaey, A. Chandrakasan, and B. Nikolić, Digital Integrated Circuits: A Design Perspective. Upper Saddle River, NJ, USA: Prentice-Hall, 2002.

[13] S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, "The implementation of the Itanium 2 microprocessor," IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1448–1460, Nov. 2002, doi: 10.1109/JSSC.2002.803943.

[14] H. Jeong, J. Park, S. C. Song, and S.-O. Jung, "Self-timed pulsed latch for low-voltage operation with reduced hold time," IEEE J. Solid-State Circuits, vol. 54, no. 8, pp. 2304–2315, Aug. 2019, doi: 10.1109/JSSC.2019.2907774.

[15] G. Shin, M. Jeong, D. Seo, S. Han, and Y. Lee, "A variation-tolerant differential contention-free pulsed latch with wide voltage scalability," in Proc. IEEE Asian Solid-State Circuits Conf. (A-SSCC), Taipei, Taiwan, Nov. 2022, pp. 1–3, doi: 10.1109/A-SSCC56115.2022.9980703.